Protection against link errors and failures using network coding

Shizheng Li and Aditya Ramamoorthy 



Abstract

We propose a network-coding based scheme to protect multiple bidirectional unicast connections
against adversarial errors and failures in a network. The end nodes of the bidirectional connections
are connected by a set of shared protection paths that provide the redundancy required for protection.
Suppose that $n_e$ paths are corrupted by the omniscient adversary. Under our proposed protocol, the
errors can be corrected at all the end nodes with $4n_e$ protection paths. More generally, if there are
$n_e$ adversarial errors and $n_f$ failures, $4n_e + 2n_f$ protection paths are sufficient. The number of
protection paths only depends on the number of errors and failures being protected against and is
independent of the number of unicast connections.

I. Introduction

Protection of networks against faults and errors is an important problem. Networks are subject
to various fault mechanisms, such as link failures and adversarial attacks, and need to
be able to function in a robust manner even in the presence of these impairments. In order
to protect networks against these issues, additional resources, e.g., spare source-terminal paths,
are usually provisioned. A good survey of issues in network protection can be found in [1].
Recently, the technique of network coding [2] was applied to the problem of network protection.
The protection strategies for link-disjoint connections in [3], [4], [5] perform network coding over
p-Cycles [6], which are shared by the connections to be protected. The work in [7], [8] uses paths
instead of cycles to carry coded data units and proposes a simple protocol that does not require
any synchronization among network nodes, yet protects multiple primary path connections
with shared protection paths. These schemes deal exclusively with link failures, e.g., due to fiber
cuts in optical networks, and assume that each node knows the location of the failures at the time
of decoding. In this work we consider the more general problem of protection against errors.
An error in the network refers to the alteration of the transmitted data unit in some manner
such that the nodes do not know the location of the errors before decoding. If errors over a
link are random, classical error control codes [19] that protect individual links may be able
to help in recovering data at the terminals. However, such a strategy will in general not work
when we consider adversarial errors in networks. An adversary may be limited in the number
of links she can control. However, on those links she can corrupt the transmission
in an arbitrary manner. An error correction code will be unable to handle a computationally
unbounded adversary who knows the associated generator matrix and the actual codewords under
transmission. This is because she can always replace the actual transmitted codeword by another
valid codeword.

In this paper we investigate the use of network coding over protection paths for protection
against adversarial errors. Protection against link failures in network-coded multicast connections
was discussed in [9]. The problem of network error correction in multicast has been studied to
some extent. Bounds such as the Hamming bound and the Singleton bound in classical coding theory
are generalized to network multicast in [10], [11]. Several error correction coding schemes have been
proposed, e.g., [12], [13], [14], [15]. However, these error correction schemes work in the context
of network-coded multicast connections.

In this work we attempt to simultaneously protect multiple unicast connections using network
coding by transmitting redundant information over protection paths. Note that even the error-free
multiple unicast problem under network coding is not completely understood given the current
state of the art [16]. Therefore we consider the multiple unicast problem under certain restrictions
on the underlying topology. In our work we consider each individual unicast to be operating
over a single primary path. Moreover, we assume that protection paths passing through the end
nodes of each unicast connection have been provisioned (see Figure 1 for an example). The
primary and protection paths can be provisioned optimally by integer linear programming (ILP).
Although the ILP has high (potentially exponential) computational complexity, it only needs to
run once before the transmission of data, and powerful ILP solvers, e.g., CPLEX, are available
to solve such problems.

Our work is a significant generalization of [3]. We assume the omniscient adversary model
[13], under which the adversary has full knowledge of all details of the protocol
(encoding/decoding algorithms, coefficients, etc.) and has no secrets hidden from her. An adversary
changes data units on several paths, which may be primary paths or protection paths. The number
of errors equals the number of paths the adversary attacks. If multiple paths share one link and
the adversary controls that link, it is treated as multiple errors. Our schemes enable all nodes
to recover from $n_e$ errors, provided that $4n_e$ protection paths are shared by all the connections.
More generally, if there are $n_e$ adversarial errors and $n_f$ failures, a total of $4n_e + 2n_f$ protection
paths are sufficient. We emphasize that the number of protection paths only depends on the
number of errors and failures being protected against and is independent of the number of
unicast connections. Simulation results show that if the number of primary paths is large, the
proposed protection scheme consumes fewer network resources than the 2+1 protection
scheme, where 2+1 means that we use two additional paths to protect each primary connection.

Section II introduces the network model and our encoding protocol, which is a generalization
of [7]. The error model is explained in Section III. In Section IV we present the decoding
algorithm and conditions when a single error happens. Generalizations to multiple errors and
combinations of errors and failures are considered in Section V and Section VI. In Section VII
we briefly show how the optimal primary and protection paths are provisioned by integer linear
programming, and simulations show that our proposed approach saves network resources.
Section VIII concludes the paper.

II. Network model and encoding protocol 

Suppose that $2n$ nodes in the network establish $n$ bidirectional unicast connections with the
same capacity. These nodes are partitioned into two disjoint sets $\mathcal{S}$ and $\mathcal{T}$ such that each node in
$\mathcal{S}$ connects to one node in $\mathcal{T}$. The $n$ connections are labeled by numbers $1, \ldots, n$ and the nodes
participating in the $i$th connection are given index $i$, i.e., $S_i$ and $T_i$. Each connection contains
one bidirectional primary path $S_i - T_i$. $S_i$ and $T_i$ send the data units they want to transmit onto the
primary path. The data unit sent from $S_i$ to $T_i$ (from $T_i$ to $S_i$) on the primary path is denoted
by $d_i$ ($u_i$). The data unit received on the primary path by $T_i$ ($S_i$) is denoted by $\hat{d}_i$ ($\hat{u}_i$).

A protection path $\mathbf{P}$ is a bidirectional path going through all $2n$ end nodes of the $n$ connections.
It has the same capacity as the primary paths and consists of two unidirectional paths $\mathbf{S}$ and
$\mathbf{T}$ in opposite directions. $M$ protection paths are used and we assume that there are enough
resources in the network so that these protection paths can always be found and provisioned.
In this paper we mainly focus on the case where all protection paths pass through all $2n$ end
nodes of the connections (see Fig. 1 for an example), and they are denoted by $\mathbf{P}^{(1)}, \ldots, \mathbf{P}^{(M)}$.
The order in which the protection paths pass through the end nodes does not matter. The more
general case where different primary path connections are protected by different protection paths
will be discussed in Section IV-F. All operations are over the finite field $GF(q)$, $q = 2^r$, where
$r$ is the length of the data unit in bits. Frequently used notations in this paper are summarized
in Table I.

The system works in rounds. Time is assumed to be slotted. Each data unit is assigned a
round number. In each round a new data unit $d_i$ or $u_i$ is transmitted by node $S_i$ or $T_i$ on its
primary path. In addition, the node also transmits an appropriately encoded data unit in each direction
on the protection path. The encoding operation is executed by each node in $\mathcal{S}$ and $\mathcal{T}$, where
all nodes have sufficiently large buffers. The encoding and decoding operations only take place
between data units of the same round. When a node is transmitting and receiving data units of
a certain round on the primary path, it is receiving data units of earlier rounds from the protection
paths. The nodes use the large, though bounded-size, buffers to store the transmitted and received
data units for encoding and decoding. Once the encoding and decoding for a certain round are
done, the data units of that round can be removed from the buffer. Overall, this ensures that the
protocol works even when there is no explicit time synchronization between the transmissions.
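Each connection $S_i - T_i$ has $2M$ encoding coefficients: $\alpha_i^{(1)}, \ldots, \alpha_i^{(M)}, \beta_i^{(1)}, \ldots, \beta_i^{(M)}$, where
$\alpha_i^{(k)}$ and $\beta_i^{(k)}$ are used for encoding on protection path $\mathbf{P}^{(k)}$. Each protection path uses the same
protocol but different coefficients in general. The coefficients are assumed to be known by the
end nodes before the transmission. We specify the protocol for protection path $\mathbf{P}^{(k)}$, which
consists of two unidirectional paths $\mathbf{S}^{(k)}$ and $\mathbf{T}^{(k)}$. We first define the following notations.

• $\sigma(S_i)$ / $\sigma(T_i)$: the next node downstream from $S_i$ (respectively $T_i$) on $\mathbf{S}^{(k)}$. $\sigma^{-1}(S_i)$ / $\sigma^{-1}(T_i)$:
the next node upstream from $S_i$ (respectively $T_i$) on $\mathbf{S}^{(k)}$ (see example in Fig. 1).

• $\tau(S_i)$ / $\tau(T_i)$: the next node downstream from $S_i$ (respectively $T_i$) on $\mathbf{T}^{(k)}$. $\tau^{-1}(S_i)$ / $\tau^{-1}(T_i)$:
the next node upstream from $S_i$ (respectively $T_i$) on $\mathbf{T}^{(k)}$ (see example in Fig. 1).

Each node transmits to its downstream node the sum of the data unit from its upstream node
and a linear combination of the data units it has, on each unidirectional protection path. Consider
the $k$th protection path $\mathbf{P}^{(k)}$, and denote the data unit transmitted on link $e \in \mathbf{S}^{(k)}$ ($e \in \mathbf{T}^{(k)}$) by $\mathbf{S}_e$
($\mathbf{T}_e$). Node $S_i$ knows $d_i, \hat{u}_i$, and $T_i$ knows $u_i, \hat{d}_i$. The encoding operations are as follows.

$$
\begin{aligned}
\mathbf{S}_{S_i \to \sigma(S_i)} &= \mathbf{S}_{\sigma^{-1}(S_i) \to S_i} + \alpha_i^{(k)} d_i + \beta_i^{(k)} \hat{u}_i, \\
\mathbf{T}_{S_i \to \tau(S_i)} &= \mathbf{T}_{\tau^{-1}(S_i) \to S_i} + \alpha_i^{(k)} d_i + \beta_i^{(k)} \hat{u}_i, \\
\mathbf{S}_{T_i \to \sigma(T_i)} &= \mathbf{S}_{\sigma^{-1}(T_i) \to T_i} + \alpha_i^{(k)} \hat{d}_i + \beta_i^{(k)} u_i, \text{ and} \\
\mathbf{T}_{T_i \to \tau(T_i)} &= \mathbf{T}_{\tau^{-1}(T_i) \to T_i} + \alpha_i^{(k)} \hat{d}_i + \beta_i^{(k)} u_i.
\end{aligned}
$$

We focus our discussion on node $T_i$. Once node $T_i$ receives data units over both $\mathbf{S}^{(k)}$ and $\mathbf{T}^{(k)}$ it
adds these data units. Denote the sum as $P^{(k)}$.¹ $T_i$ gets two values $\mathbf{S}_{\sigma^{-1}(T_i) \to T_i}$ and $\mathbf{T}_{\tau^{-1}(T_i) \to T_i}$
from $\mathbf{P}^{(k)}$. $P^{(k)}$ equals

$$
P^{(k)} = \mathbf{S}_{\sigma^{-1}(T_i) \to T_i} + \mathbf{T}_{\tau^{-1}(T_i) \to T_i}
= \sum_{l \in I_{\setminus i}} \alpha_l^{(k)} (d_l + \hat{d}_l) + \sum_{l \in I_{\setminus i}} \beta_l^{(k)} (u_l + \hat{u}_l) + \alpha_i^{(k)} d_i + \beta_i^{(k)} \hat{u}_i. \tag{1}
$$

In the absence of any errors, $\hat{d}_l = d_l$ and $\hat{u}_l = u_l$ for all $l$, most terms cancel out because the addition
operations are performed over an extension field of the binary field, and $P^{(k)} = \alpha_i^{(k)} d_i + \beta_i^{(k)} u_i$.
Similar expressions can be derived for the other end nodes. See Fig. 1 for an example of the
encoding protocol.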
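The cancellation argument can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes $GF(2^8)$ with the reduction polynomial 0x11B (the protocol only requires some $GF(2^r)$), a random visiting order of the end nodes, and hypothetical helper names `gf_mul` and `received_sum`. It simulates one protection path in the error-free case and compares the sum seen by each $T_i$ with $\alpha_i d_i + \beta_i u_i$.

```python
import random

def gf_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8); the polynomial is an assumption of this sketch."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def received_sum(order, contrib, node):
    """Sum of the two data units `node` receives on one protection path.

    `order` lists the end nodes along the S direction; the T direction visits
    them in reverse. The S unit therefore accumulates the nodes before `node`
    and the T unit accumulates the nodes after it. Addition in GF(2^r) is XOR.
    """
    pos = order.index(node)
    s_unit = 0
    for upstream in order[:pos]:
        s_unit ^= contrib[upstream]
    t_unit = 0
    for upstream in order[pos + 1:]:
        t_unit ^= contrib[upstream]
    return s_unit ^ t_unit

random.seed(7)
n = 4
d = [random.randrange(256) for _ in range(n)]
u = [random.randrange(256) for _ in range(n)]
alpha = [random.randrange(1, 256) for _ in range(n)]
beta = [random.randrange(1, 256) for _ in range(n)]

# Any visiting order works; the protocol does not depend on it.
order = [("S", l) for l in range(n)] + [("T", l) for l in range(n)]
random.shuffle(order)

# Error-free case: received units equal sent units, so S_l and T_l add the same term.
contrib = {}
for l in range(n):
    c = gf_mul(alpha[l], d[l]) ^ gf_mul(beta[l], u[l])
    contrib[("S", l)] = c
    contrib[("T", l)] = c

sums = [received_sum(order, contrib, ("T", i)) for i in range(n)]
expected = [gf_mul(alpha[i], d[i]) ^ gf_mul(beta[i], u[i]) for i in range(n)]
print(sums == expected)
```

Every connection other than $i$ contributes the same term twice (once from each end node), so it vanishes under XOR; only the contribution of $S_i$ survives at $T_i$.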

III. Error Model 

If the adversary changes data units on one (primary or protection) path, an error happens. If
the adversary controls a link through which multiple paths pass, or the adversary controls several
links, multiple errors occur. We assume that the adversary knows the communication protocols
described above, including the encoding/decoding functions and encoding coefficients. There are
no secrets hidden from her. If a primary or protection path is under the control of an adversary,
she can arbitrarily change the data units in each direction on that path. If $\hat{d}_i \neq d_i$ or $\hat{u}_i \neq u_i$ (or
both), we say that there is an error on primary path $S_i - T_i$ with error values $e_{d_i} = d_i + \hat{d}_i$
and $e_{u_i} = u_i + \hat{u}_i$. As for a protection path error, although the error is bidirectional, we shall see
that each node will see only one error due to the nature of the encoding protocol. In fact, even
multiple errors on the same protection path can be shown to only have an aggregate effect as
one error at one node. This is because from one protection path, only the sum ($P^{(k)}$) of data
units from two directions is used in decoding at a node. If this data unit is changed due to
several errors, it can be modeled as one variable $e_{p_k}$ at the node. However, different nodes will
have different values of $e_{p_k}$ in general. If there is a primary path failure (as opposed to error)
on $S_i - T_i$, we have $\hat{d}_i = \hat{u}_i = 0$, i.e., failures are not adversarial. If a protection path fails,
it becomes useless and the end nodes ignore the data units on that path. All nodes know the
locations of failures but do not know the locations of errors.

¹ The values of $P^{(k)}$ are different at different end nodes. Here we focus our discussion on node $T_i$. To keep the notation
simple, we use $P^{(k)}$ instead of $P^{(k)}_{T_i}$.

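When there are errors in the network, the error terms will not cancel out in (1) and $T_i$ obtains

$$
P^{(k)} = \alpha_i^{(k)} d_i + \beta_i^{(k)} \hat{u}_i + \sum_{l \in I_{\setminus i}} \left( \alpha_l^{(k)} e_{d_l} + \beta_l^{(k)} e_{u_l} \right) + e_{p_k}
$$

on protection path $\mathbf{P}^{(k)}$, where $I_{\setminus i} = \{1, \ldots, n\} \setminus \{i\}$ is the index set excluding $i$, and $e_{p_k}$ is the
error on protection path $\mathbf{P}^{(k)}$ seen by $T_i$. Note that since $T_i$ knows $u_i$, we can subtract it from this
equation. Together with the data unit $P_m$ from the primary path, $T_i$ has the following data units.

$$
P_m = \hat{d}_i = d_i + e_{d_i}, \tag{2}
$$

$$
P^{(k)'} = P^{(k)} - \beta_i^{(k)} u_i = \alpha_i^{(k)} d_i + \beta_i^{(k)} e_{u_i} + \sum_{l \in I_{\setminus i}} \left( \alpha_l^{(k)} e_{d_l} + \beta_l^{(k)} e_{u_l} \right) + e_{p_k}, \quad k = 1, \ldots, M. \tag{3}
$$

We multiply (2) by $\alpha_i^{(k)}$ and add it to the $k$th equation in (3) to obtain

$$
\sum_{l=1}^{n} \left( \alpha_l^{(k)} e_{d_l} + \beta_l^{(k)} e_{u_l} \right) + e_{p_k} = \alpha_i^{(k)} P_m + P^{(k)'}, \quad k = 1, \ldots, M. \tag{4}
$$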

This can be represented in matrix form as

$$
\begin{bmatrix}
\alpha_1^{(1)} & \beta_1^{(1)} & \cdots & \alpha_n^{(1)} & \beta_n^{(1)} & 1 & 0 & \cdots & 0 \\
\alpha_1^{(2)} & \beta_1^{(2)} & \cdots & \alpha_n^{(2)} & \beta_n^{(2)} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\alpha_1^{(M)} & \beta_1^{(M)} & \cdots & \alpha_n^{(M)} & \beta_n^{(M)} & 0 & 0 & \cdots & 1
\end{bmatrix}
E = P_{syn}, \tag{5}
$$

where the length-$(2n + M)$ vector $E = [e_{d_1}, e_{u_1}, \ldots, e_{d_n}, e_{u_n}, e_{p_1}, \ldots, e_{p_M}]^T$ and the length-$M$
vector $P_{syn} = [\alpha_i^{(1)} P_m + P^{(1)'}, \alpha_i^{(2)} P_m + P^{(2)'}, \ldots, \alpha_i^{(M)} P_m + P^{(M)'}]^T$. Analogous to classical
coding theory, we call $P_{syn}$ the syndrome available at the decoder. Denote the $M \times (2n + M)$
coefficient matrix of (5) as $H_{ext}$, and denote the first $2n$ columns of $H_{ext}$ as a matrix $H =
[v_1, v_2, \ldots, v_{2n}]$, where $v_j$ is the $j$th column of $H$. Then $v_{2i-1}, v_{2i}$ are the columns consisting
of the encoding coefficients $\alpha_i$'s and $\beta_i$'s for the connection $S_i - T_i$. The last $M$ columns of $H_{ext}$
form an identity matrix $I_{M \times M}$ and can be denoted column by column as $[v_1^p, \ldots, v_M^p]$. Note that
$T_i$ knows $H$ and $P_{syn}$ and shall attempt to decode $d_i$ even in the presence of the errors. Node
$S_i$ gets very similar equations to those at $T_i$. Thus we will focus our discussion on $T_i$. Each end
node uses the same decoding algorithm and works individually without cooperation and without
synchronization.



IV. Recovery from single error 

In this section, we focus on the case when there is only one error in the network. We first 
present the decoding algorithm and then prove its correctness under appropriate conditions. 

A. Decoding algorithm at node $T_i$ ($S_i$ operates similarly)

1) Attempt to solve the following system of equations:

$$
[v_{2i-1} \ v_{2i}] \begin{bmatrix} e_{d_i} \\ e_{u_i} \end{bmatrix} = P_{syn}. \tag{6}
$$

2) If (6) has a solution $(e_{d_i}, e_{u_i})$, compute $d_i = P_m + e_{d_i}$; otherwise, $d_i = P_m$.

We show below that this algorithm works when the error happens on a primary path or on
one of the protection paths.
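The two steps above can be sketched in code. This is an illustration under stated assumptions rather than the authors' implementation: the field is $GF(2^8)$ with reduction polynomial 0x11B, $M = 4$, the coefficient columns are powers of distinct non-zero field elements (one possible assignment), and `solve_two_unknowns`, `syndrome` and `decode_at_T` are hypothetical helper names. Connections are indexed from 0 in the code.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8); the reduction polynomial is an assumption of this sketch."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(a):
    """Inverse of a non-zero element via a^254 (since a^255 = 1)."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def solve_two_unknowns(va, vb, syn):
    """Solve x*va + y*vb = syn over GF(2^8); return (x, y) or None if unsolvable."""
    m = len(syn)
    for r in range(m):
        for s in range(r + 1, m):
            det = gf_mul(va[r], vb[s]) ^ gf_mul(va[s], vb[r])
            if det == 0:
                continue
            inv = gf_inv(det)
            # Cramer's rule on rows r and s (subtraction is XOR in characteristic 2).
            x = gf_mul(gf_mul(syn[r], vb[s]) ^ gf_mul(syn[s], vb[r]), inv)
            y = gf_mul(gf_mul(va[r], syn[s]) ^ gf_mul(va[s], syn[r]), inv)
            ok = all(gf_mul(x, va[k]) ^ gf_mul(y, vb[k]) == syn[k] for k in range(m))
            return (x, y) if ok else None
    return None  # the two columns do not have rank two

def power_column(g, m=4):
    """Column [1, g, g^2, ..., g^(m-1)]."""
    col, p = [], 1
    for _ in range(m):
        col.append(p)
        p = gf_mul(p, g)
    return col

# Columns v_1, ..., v_2n of H for n = 3 connections (illustrative coefficient choice).
cols = [power_column(g) for g in [2, 3, 4, 5, 6, 7]]

def syndrome(j, e_d, e_u):
    """Syndrome produced by a single primary-path error on connection j."""
    return [gf_mul(e_d, a) ^ gf_mul(e_u, b) for a, b in zip(cols[2 * j], cols[2 * j + 1])]

def decode_at_T(i, p_m, syn):
    """Step 1: try to solve (6); step 2: correct P_m only if a solution exists."""
    sol = solve_two_unknowns(cols[2 * i], cols[2 * i + 1], syn)
    return p_m ^ sol[0] if sol is not None else p_m

d = [0x11, 0x22, 0x33]
e_d, e_u = 0x5A, 0x07                 # adversarial error on connection 1
p_m = [d[0], d[1] ^ e_d, d[2]]        # what each T_i sees on its primary path
syn = syndrome(1, e_d, e_u)
decoded = [decode_at_T(i, p_m[i], syn) for i in range(3)]
print(decoded == d)
```

Nodes whose own primary path is intact find the system unsolvable and keep $P_m$; only the node on the attacked path finds a solution and corrects its data unit.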

B. Condition for one primary path error correction 

In this subsection, we consider primary path errors only. Define an error pattern to be the two
columns in $H$ corresponding to the erroneous primary path. If the error happens on $S_i - T_i$,
the error pattern is $\{v_{2i-1}, v_{2i}\}$. An error value vector corresponding to an error pattern is
obtained by letting the error values corresponding to the other $n - 1$ primary paths be zero.
The error value vector corresponding to error pattern $\{v_{2i-1}, v_{2i}\}$ is the length-$2n$ vector $E_i =
[0, \ldots, e_{d_i}, e_{u_i}, \ldots, 0]^T$. Assume that the $e_{d_i}$'s and $e_{u_i}$'s are not all zero. The case when all of them
are zero is trivial because it implies that no error happens.

Theorem 1: Suppose there is at most one error on a primary path. The decoding algorithm
outputs the correct data unit at every node if and only if the vectors in the set
$\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}$² for all $i, j = 1, \ldots, n$, $i \neq j$ are linearly independent.

Proof: First assume that the vectors in the sets $\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}$ are linearly independent.
Let $E_a$ and $E_b$ be error value vectors corresponding to errors happening on different primary
paths $S_a - T_a$ and $S_b - T_b$ respectively. Suppose there exist $E_a$ and $E_b$ such that $HE_a = HE_b$, i.e.,
$H(E_a + E_b) = 0$. Note that the vector $(E_a + E_b)$ has at most four error values $[e_{d_a}, e_{u_a}, e_{d_b}, e_{u_b}]$
which are not all zero and such that $[v_{2a-1}, v_{2a}, v_{2b-1}, v_{2b}][e_{d_a}, e_{u_a}, e_{d_b}, e_{u_b}]^T = 0$. This implies
$\{v_{2a-1}, v_{2a}, v_{2b-1}, v_{2b}\}$ are linearly dependent, which is a contradiction. Therefore, under our
condition that $\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}$ for all $i, j = 1, \ldots, n$, $i \neq j$ are linearly independent, there
do not exist $E_a$, $E_b$ such that $HE_a = HE_b$. This means that if we try to solve the system
of linear equations according to every possible error value vector $E_1, \ldots, E_n$, it either has no
solution or its solution is the actual error in the network. The node $T_i$ is only interested in $d_i$;
in our decoding algorithm, it tries to solve the equations (6) according to the error value vector
$E_i$. If it has a solution, the error happens on $S_i - T_i$. The matrix $[v_{2i-1}, v_{2i}]$ has rank two, so
equations (6) have a unique solution for $e_{d_i}$, and $d_i = P_m + e_{d_i}$ gives the decoded $d_i$. If (6) does not have
a solution, the error is not on $S_i - T_i$. $T_i$ simply picks up $d_i = P_m$ from the primary path $S_i - T_i$.

Conversely, suppose that a vector set $\{v_{2i_1-1}, v_{2i_1}, v_{2j_1-1}, v_{2j_1}\}$ is linearly dependent. There
exist $E_{i_1}$ and $E_{j_1}$ such that $HE_{i_1} = HE_{j_1}$. Both equations $HE_{i_1} = P_{syn}$ and $HE_{j_1} = P_{syn}$
have a solution. Suppose the error in fact happens on $S_{j_1} - T_{j_1}$; the decoder at $T_{i_1}$ can also find
a solution to $HE_{i_1} = P_{syn}$ and use the solution to compute $d_{i_1}$. This leads to a decoding error. ■

² In fact, it can be viewed as the error pattern when $S_i - T_i$ and $S_j - T_j$ are in error.

If there is no error in the network, $P_{syn} = 0$ and solving (6) gives $e_{d_i} = e_{u_i} = 0$. In order
to make $\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}$ independent, we need the length of the vectors to be at least four,
i.e., $M \geq 4$. In fact, we shall see that several coefficient assignment strategies ensure that
four protection paths are sufficient to make the condition hold for all $i, j = 1, \ldots, n$, $i \neq j$. The
condition in Theorem 1 can be stated as: all $M \times M$ ($4 \times 4$) matrices of the form

$$
[v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}], \quad i, j = 1, \ldots, n, \ i < j \tag{7}
$$

have full rank.

C. Coefficient assignment methods 

We shall introduce several ways to assign encoding coefficients, so that (7) has full rank. Later
we will see that these schemes also work when protection path errors are possible.

(1) A simple scheme of coefficient assignment and implementation. Choose $n$ non-zero distinct
elements $\gamma_1, \ldots, \gamma_n$ from $GF(q)$. For all $i = 1, \ldots, n$, let $\alpha_i^{(1)} = 1$, $\alpha_i^{(2)} = \gamma_i$, $\beta_i^{(3)} = 1$, $\beta_i^{(4)} = \gamma_i$,
and let all other coefficients be zero. It can be shown by performing Gaussian elimination that the
matrix (7) has full rank as long as the $\gamma$'s are distinct. The minimum field size needed is $q > n$.
Consider decoding at node $T_i$. Table II is a summary of the data units $P_m$, $P_{syn}$ that $T_i$ gets from the
primary path and protection paths under different cases. $P_{syn}^{(k)}$ is the $k$th component of $P_{syn}$. The
decoding is done as follows. If $P_{syn}^{(1)}$ and $P_{syn}^{(2)}$ are both zero, then $e_{d_l} = 0, \forall l$, and $T_i$ simply picks
$d_i = P_m$. If $P_{syn}^{(1)}$ and $P_{syn}^{(2)}$ are both non-zero, $T_i$ computes $S = P_{syn}^{(2)} \times (P_{syn}^{(1)})^{-1}$. If $S = \gamma_i$, the
error happens on $S_i - T_i$ and the error value is $e_{d_i} = P_{syn}^{(1)}$, then $d_i = P_m + e_{d_i}$. If $S = \gamma_x$, the
error happens on $S_x - T_x$, $x \neq i$, then $T_i$ picks up $d_i = P_m$.

Note that we only used $P_m$, $P_{syn}^{(1)}$, $P_{syn}^{(2)}$ to decode $d_i$ at $T_i$. However, we cannot remove paths
$\mathbf{P}^{(3)}$, $\mathbf{P}^{(4)}$ because at $S_i$ we should use $P_m$, $P_{syn}^{(3)}$, $P_{syn}^{(4)}$ to decode.
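The ratio test of the simple scheme is short enough to write out. The sketch below is illustrative only: it assumes $GF(2^8)$ with reduction polynomial 0x11B, uses the hypothetical function name `decode_simple`, indexes connections from 0, and treats the case where exactly one of the two syndrome components is non-zero as a protection path error (so the primary path data unit is kept).

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8); the reduction polynomial is an assumption of this sketch."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(a):
    """Inverse of a non-zero element via a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def decode_simple(p_m, syn1, syn2, gamma_i):
    """Decode d_i at T_i from P_m and the first two syndrome components."""
    if syn1 == 0 or syn2 == 0:
        # Both zero: no error on any d_l. Exactly one zero: protection-path error.
        return p_m
    ratio = gf_mul(syn2, gf_inv(syn1))      # S = syn2 * syn1^{-1}
    if ratio == gamma_i:
        return p_m ^ syn1                   # error is on S_i - T_i, value syn1
    return p_m                              # error is on another primary path

gammas = [1, 2, 3]                          # distinct non-zero elements, n = 3
d = [0x10, 0x20, 0x30]
x, e_d = 2, 0x4D                            # error on connection x with value e_d

# With alpha^(1) = 1 and alpha^(2) = gamma, the two components are e_d and gamma_x * e_d.
syn1, syn2 = e_d, gf_mul(gammas[x], e_d)
p_m = [d[i] ^ (e_d if i == x else 0) for i in range(3)]
decoded = [decode_simple(p_m[i], syn1, syn2, gammas[i]) for i in range(3)]
print(decoded == d)
```

Because the ratio identifies $\gamma_x$ directly, the decoder needs one field inversion and one multiplication per round instead of a general linear solve.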

(2) Vandermonde matrix. The second way is to choose $2n$ distinct elements from $GF(q)$:
$\gamma_{\alpha_1}, \gamma_{\beta_1}, \ldots, \gamma_{\alpha_n}, \gamma_{\beta_n}$ and let the encoding coefficients be $\alpha_i^{(k)} = \gamma_{\alpha_i}^{k-1}$, $\beta_i^{(k)} = \gamma_{\beta_i}^{k-1}$. The matrix
in equation (7) becomes a Vandermonde matrix and has full rank.
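The full-rank condition can be verified mechanically for a given assignment. The following sketch assumes $GF(2^8)$ with reduction polynomial 0x11B and $M = 4$; `gf_rank` and `theorem1_condition` are hypothetical helper names. It builds the columns from powers of the chosen elements and checks every pair of connections by Gaussian elimination.

```python
from itertools import combinations

def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8); the reduction polynomial is an assumption of this sketch."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(a):
    """Inverse of a non-zero element via a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def gf_rank(vectors):
    """Rank of a list of vectors over GF(2^8), by Gauss-Jordan elimination."""
    rows = [list(v) for v in vectors]
    rank = 0
    for c in range(len(rows[0])):
        piv = next((r for r in range(rank, len(rows)) if rows[r][c]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = gf_inv(rows[rank][c])
        rows[rank] = [gf_mul(x, inv) for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][c]:
                f = rows[r][c]
                rows[r] = [x ^ gf_mul(f, y) for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def power_column(g, m=4):
    """Column [1, g, g^2, ..., g^(m-1)] of a Vandermonde-type matrix."""
    col, p = [], 1
    for _ in range(m):
        col.append(p)
        p = gf_mul(p, g)
    return col

def theorem1_condition(cols):
    """True if every set {v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}}, i < j, has rank 4."""
    n = len(cols) // 2
    return all(
        gf_rank([cols[2 * i], cols[2 * i + 1], cols[2 * j], cols[2 * j + 1]]) == 4
        for i, j in combinations(range(n), 2)
    )

good_cols = [power_column(g) for g in [1, 2, 3, 4, 5, 6, 7, 8]]   # 2n distinct elements
bad_cols = [power_column(g) for g in [1, 2, 3, 4, 1, 6, 7, 8]]    # one element repeated
print(theorem1_condition(good_cols), theorem1_condition(bad_cols))
```

The second assignment repeats an element, which makes two columns identical and shows how the check fails when the distinctness requirement is violated.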

(3) Random choice. Besides the structured matrices above, choosing coefficients at random from
a large field also works with high probability due to the following claim.

Claim 1: When all coefficients are randomly, independently and uniformly chosen from $GF(q)$,
for given $i$ and $j$, the probability that $\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}$ are linearly independent is

$$
p_1 = (1 - 1/q^3)(1 - 1/q^2)(1 - 1/q).
$$

Proof: Suppose we have chosen a nonzero $v_{2i-1}$; the probability that $v_{2i}$ is not in the span of $v_{2i-1}$ is
$(1 - q/q^4)$. The probability that $v_{2j-1}$ is not in the span of $\{v_{2i-1}, v_{2i}\}$ is $(1 - q^2/q^4)$. The
probability that $v_{2j}$ is not in the span of $\{v_{2i-1}, v_{2i}, v_{2j-1}\}$ is $(1 - q^3/q^4)$. Since the coefficients
are chosen independently, the probability that the four vectors are linearly independent is the product
$p_1$, which approaches 1 when $q$ is large. ■

In (7) we require $\binom{n}{2}$ matrices to have full rank. By the union bound, the probability that the linear
independence condition in Theorem 1 holds is at least $1 - (1 - p_1)\binom{n}{2}$, which is close to 1 when
$q$ is large. In practice, before any transmission, we could generate the coefficients randomly
until they satisfy the condition in Theorem 1 and then transmit those coefficients to all the end
nodes in the network. During the actual transmission of the data units, the encoding coefficients
do not change.
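As a sanity check on the claim, the product formula can be compared with an exhaustive count over the smallest field. The sketch below is an illustration with hypothetical names (`p1`, `independent_gf2`); it works over $GF(2)$ with length-4 vectors encoded as 4-bit integers and, as an interpretation made for this sketch, fixes the first column to be non-zero, which is the situation the product of three factors describes.

```python
from fractions import Fraction
from itertools import product

def p1(q):
    """Product formula for four length-4 columns, given a non-zero first column."""
    return (1 - Fraction(1, q**3)) * (1 - Fraction(1, q**2)) * (1 - Fraction(1, q))

def independent_gf2(vectors):
    """Linear independence over GF(2); each vector is an integer bit mask."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)     # clear the leading bit of b in v if it is set
        if v == 0:
            return False
        basis.append(v)
    return True

good = total = 0
for v1 in range(1, 16):                         # non-zero first column
    for v2, v3, v4 in product(range(16), repeat=3):
        total += 1
        good += independent_gf2([v1, v2, v3, v4])

print(Fraction(good, total), p1(2))
```

For $q = 2$ the probability is far from 1, which is why the scheme relies on a large field; the same function `p1` can be evaluated at $q = 2^r$ to see how quickly it approaches 1.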

D. Taking protection path error into account 

In this subsection, we take protection path errors into account. The error (assume one error
in this section) can happen either on one primary path or on one protection path. Besides the $n$ error
value vectors $E_1, \ldots, E_n$, we have $M$ more error value vectors for the protection path error:
$[\mathbf{0} \mid e_{p_1}, 0, \ldots, 0]^T, \ldots, [\mathbf{0} \mid 0, 0, \ldots, e_{p_M}]^T$, where $\mathbf{0}$ denotes an all-zero vector of length $2n$. Denote
them by $E_{p_1}, \ldots, E_{p_M}$. Using a similar idea to Theorem 1 we have the following:

Theorem 2: If there is one error on one primary path or protection path, the decoding algorithm
works for every node if and only if the vectors in the sets

$$
\{v_{2i-1}, v_{2i}, v_{2j-1}, v_{2j}\}, \quad i, j = 1, \ldots, n, \ i \neq j \tag{8}
$$

$$
\{v_{2i-1}, v_{2i}, v_l^p\}, \quad i = 1, \ldots, n, \ l = 1, \ldots, M \tag{9}
$$

are linearly independent. Note that $v_l^p$ is the $l$th column of $I_{M \times M}$ in (5).

In fact, $M = 4$ suffices and the three coefficient assignment methods we described in the
previous subsection work in this case. The simple coefficient assignment strategy in Section
IV-C (1) enables the vector sets (8) and (9) to be independent. A protection path error makes exactly
one component of $P_{syn}$ nonzero. If $T_i$ detects that $P_{syn}$ has only one nonzero entry, it can just
pick up the data unit from the primary path since the only error is on the protection path.

In order to see that the Vandermonde matrix also works, we shall show that the vector sets (9)
are linearly independent. Suppose that they are linearly dependent. Since $v_{2i-1}, v_{2i}$ are linearly
independent, there exist $a$ and $b$ such that (take $v_1^p$ for example) $a v_{2i-1} + b v_{2i} = v_1^p$. Looking at the
second and third components, this means
$a[\gamma_{\alpha_i} \ \gamma_{\alpha_i}^2]^T + b[\gamma_{\beta_i} \ \gamma_{\beta_i}^2]^T = 0$. However, this is impossible since, for distinct nonzero
$\gamma_{\alpha_i}$ and $\gamma_{\beta_i}$,

$$
\det \begin{bmatrix} \gamma_{\alpha_i} & \gamma_{\beta_i} \\ \gamma_{\alpha_i}^2 & \gamma_{\beta_i}^2 \end{bmatrix} \neq 0.
$$

Therefore, $\{v_{2i-1}, v_{2i}, v_1^p\}$ are linearly independent. A similar argument holds for $v_l^p$ when
$l = 2, 3, 4$.

When the coefficients are randomly chosen from $GF(q)$, for given $i$ and $l$, the probability that
$\{v_{2i-1}, v_{2i}, v_l^p\}$ are linearly independent is $p_2 = (1 - 1/q^3)(1 - 1/q^2)$. Considering all vector sets
in Theorem 2, the probability of successful decoding at all nodes is at least $1 - (1 - p_1)\binom{n}{2} -
(1 - p_2)nM$, which approaches 1 when $q$ is large.
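To get a feel for how large the field must be, the union bound can be evaluated exactly with rational arithmetic. This is a sketch under assumptions: it takes $p_1 = (1 - 1/q^3)(1 - 1/q^2)(1 - 1/q)$ and $p_2 = (1 - 1/q^3)(1 - 1/q^2)$ for $M = 4$, and the function names (`theorem2_bound`, `min_field_bits`) are hypothetical.

```python
from fractions import Fraction
from math import comb

def p1(q):
    """Probability that four random length-4 columns are independent (product form)."""
    return (1 - Fraction(1, q**3)) * (1 - Fraction(1, q**2)) * (1 - Fraction(1, q))

def p2(q):
    """Probability that two random length-4 columns and one unit vector are independent."""
    return (1 - Fraction(1, q**3)) * (1 - Fraction(1, q**2))

def theorem2_bound(q, n, M=4):
    """Union-bound lower bound on the probability that all sets in (8), (9) are independent."""
    return 1 - (1 - p1(q)) * comb(n, 2) - (1 - p2(q)) * n * M

def min_field_bits(n, target, M=4, max_bits=64):
    """Smallest r such that the bound for q = 2^r reaches `target`, or None."""
    for r in range(1, max_bits + 1):
        if theorem2_bound(2**r, n, M) >= target:
            return r
    return None

for r in (8, 12, 16):
    print(r, float(theorem2_bound(2**r, n=10)))
print(min_field_bits(10, Fraction(99, 100)))
```

The bound is dominated by the $\binom{n}{2}/q$ term, so the required data unit length grows only logarithmically with the number of connections.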

E. Remark 

We can compare our results with classical results in coding theory. In classical coding theory, 
in the presence of two adversarial errors, we need a code with minimum distance at least five for 
correct decoding. This means that to transmit one symbol of information, we need to transmit a 
codeword with at least five symbols. In our problem, each connection has a total of five paths 
(one primary and four protection). A single error on a bidirectional primary path induces two 
errors, one in each direction. Therefore in an approximate sense we are using almost the optimal 
number of protection paths. However, a proof of this statement seems to be hard to arrive at. It 
is important to note that the protection paths are shared so the cost of protection per primary 
path connection is small. 

F. The case when the primary paths are protected by different protection paths 

If the primary paths are protected by different protection paths, the models are similar.
Specifically, consider node $T_i$ and suppose it is protected by the protection path $\mathbf{P}_k$. If we denote the set
of primary paths protected by protection path $\mathbf{P}_k$ by $N(\mathbf{P}_k) \subseteq \{1, \ldots, n\}$, the equation obtained
from protection path $\mathbf{P}_k$ by $T_i$ is similar to (4):

$$
\sum_{l \in N(\mathbf{P}_k)} \left( \alpha_l^{(k)} e_{d_l} + \beta_l^{(k)} e_{u_l} \right) + e_{p_k} = \alpha_i^{(k)} P_m + P^{(k)'}.
$$

Now, $T_i$ obtains $M_i$ equations, where $M_i$ is the number of protection paths protecting connection
$S_i - T_i$. The system of equations it gets is similar to (5), but the $M_i \times 2n$ coefficient matrix $H$
may contain zeros induced by the network topology. If connection $S_l - T_l$ is not protected by
$\mathbf{P}_k$, the corresponding two terms in the $k$th row are zero. The identity matrix in $H_{ext}$ is $I_{M_i \times M_i}$.
The models are similar to the case when all connections are protected by the same protection
paths, and the decoding algorithms and conditions in Theorems 1 and 2 still work.

The difference comes from the coefficient assignment. $H$ may contain some zeros depending
on the topology. In order to make (8), (9) linearly independent, we can use the method of
matrix completion [17]. We view the encoding coefficients in $H$ as indeterminates to be decided.
The matrices we require to have full rank are a collection $\mathcal{C}_H$ of submatrices of $H_{ext}$, where
$\mathcal{C}_H$ depends on the network topology. Each matrix in $\mathcal{C}_H$ consists of some indeterminates and
possibly some zeros due to the topological constraints and ones coming from the last $M_i$ columns
of $H_{ext}$. The problem of choosing encoding coefficients can be solved by matrix completion.
A simultaneous max-rank completion of $\mathcal{C}_H$ is an assignment of values from $GF(q)$ to the
indeterminates that preserves the rank of all matrices in $\mathcal{C}_H$. After completion, each matrix will
have the maximum possible rank. Note that if $H$ contains too many zeros, it may not be possible
to make the matrices have the required rank when $M_i = 4$. Thus, $M_i = 4$ is a necessary
but not in general sufficient condition for successful recovery. It is known that choosing the
indeterminates at random from a sufficiently large field can solve the matrix completion problem
with high probability [18]. Hence, we can choose encoding coefficients randomly from a large
field. It is clear therefore that the general case can be treated conceptually in a similar manner
to what we discussed earlier. Thus, we shall mainly focus on the case when the protection paths
protect all the primary paths.

V. Recovery from multiple errors 

Our analysis can be generalized to multiple errors on primary and protection paths. Assume
that $n_c$ errors happen on primary paths and $n_p = n_e - n_c$ errors happen on protection paths. As
described in Section III, a given primary path error corresponds to two specific columns in $H_{ext}$
while a protection path error corresponds to one specific column in $H_{ext}$. Recall that we view
$H_{ext}$ as a set of column vectors: $\{v_1, v_2, \ldots, v_{2n-1}, v_{2n}, v_1^p, v_2^p, \ldots, v_M^p\}$. An error pattern is
specified by the subset of columns of $H_{ext}$ corresponding to the paths in error.

Definition 1: A subset of columns of $H_{ext}$ denoted as $A(m_1, m_2)$ is an error pattern with
$m_1$ errors on primary paths $\{c_1, \ldots, c_{m_1}\} \subseteq \{1, \ldots, n\}$ and $m_2$ errors on protection paths
$\{p_1, \ldots, p_{m_2}\} \subseteq \{1, \ldots, M\}$ if it has the following form: $A(m_1, m_2) = A_c(m_1) \cup A_p(m_2)$,
where $A_c(m_1) = \{v_{2c_1-1}, v_{2c_1}, \ldots, v_{2c_{m_1}-1}, v_{2c_{m_1}}\}$, $c_i \in \{1, \ldots, n\}$, $\forall i = 1, \ldots, m_1$ and
$A_p(m_2) = \{v_{p_1}^p, \ldots, v_{p_{m_2}}^p\}$, $p_i \in \{1, \ldots, M\}$, $\forall i = 1, \ldots, m_2$.

Note that $|A(m_1, m_2)| = 2m_1 + m_2$ and the set of columns in $H_{ext}$ can be expressed as $A(n, M)$.
Although our definition of error pattern is different from the conventional definition in classical
coding theory, we shall find it helpful for the discussion of our algorithms.

We let $\mathcal{A}(m_1, m_2)$ denote the family of error patterns with $m_1$ primary path errors and $m_2$
protection path errors (for brevity, henceforth we refer to such errors as $(m_1, m_2)$ type errors).

Definition 2: Define $\mathcal{A}(m_1, m_2)_i$, a subset of $\mathcal{A}(m_1, m_2)$, to be the family of $(m_1, m_2)$ type
error patterns such that each error pattern includes an error on primary path $S_i - T_i$, i.e.,
$A(m_1, m_2) \in \mathcal{A}(m_1, m_2)_i$ if and only if $\{v_{2i-1}, v_{2i}\} \subseteq A(m_1, m_2)$.

Note that $|\mathcal{A}(m_1, m_2)| = \binom{n}{m_1}\binom{M}{m_2}$ and $|\mathcal{A}(m_1, m_2)_i| = \binom{n-1}{m_1-1}\binom{M}{m_2}$. Denote the family of
error patterns including an error on $S_i - T_i$ with $n_e$ errors in total as
$\mathcal{A}_i(n_e) = \bigcup_{n_c=1}^{n_e} \mathcal{A}(n_c, n_e - n_c)_i$.
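The families of error patterns are easy to enumerate, which also confirms the counting formulas. The sketch below is illustrative; the representation of a pattern as a pair of index tuples and the names `error_patterns`, `columns_of` and `patterns_for_node` are choices made here, not notation from the text.

```python
from itertools import combinations
from math import comb

def error_patterns(n, M, m1, m2):
    """Yield every (m1, m2) type pattern as (primary indices, protection indices), 1-based."""
    for prim in combinations(range(1, n + 1), m1):
        for prot in combinations(range(1, M + 1), m2):
            yield prim, prot

def columns_of(pattern):
    """Columns of H_ext selected by a pattern: two per primary path, one per protection path."""
    prim, prot = pattern
    cols = []
    for c in prim:
        cols += [("v", 2 * c - 1), ("v", 2 * c)]
    cols += [("vp", p) for p in prot]
    return cols

def patterns_for_node(n, M, n_e, i):
    """Patterns with n_e errors in total that include an error on primary path i."""
    out = []
    for n_c in range(1, n_e + 1):
        out += [p for p in error_patterns(n, M, n_c, n_e - n_c) if i in p[0]]
    return out

n, M, m1, m2 = 5, 8, 2, 1
family = list(error_patterns(n, M, m1, m2))
print(len(family), comb(n, m1) * comb(M, m2))
print(len([p for p in family if 1 in p[0]]), comb(n - 1, m1 - 1) * comb(M, m2))
print(columns_of(((1, 3), (2,))))
```

The size of the family visited by a decoder grows combinatorially in $n_e$, which is the source of the high complexity of exhaustive decoding.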

Our definition of an error pattern has only specified the location of the error but not the actual
values. An error value vector $E$ has the following form: $[e_{d_1}, e_{u_1}, \ldots, e_{d_n}, e_{u_n}, e_{p_1}, \ldots, e_{p_M}]^T$.
Each entry of the vector corresponds to one column in $H_{ext}$. An error value vector $E$ corresponds
to an error pattern $A(m_1, m_2)$ if in $E$, the entries corresponding to $A(n, M) \setminus A(m_1, m_2)$ are zero,
while the other entries may be non-zero and are indeterminates in the decoding algorithm. We
are now ready to present the decoding algorithm in the presence of multiple errors.

A. Multiple errors decoding algorithm at node $T_i$ ($S_i$ operates similarly)

1) Try to solve the system of linear equations specified in (5) according to each error pattern
in $\mathcal{A}_i(n_e)$. This means that for each error pattern in $\mathcal{A}_i(n_e)$, replace $E$ in (5) by the error value
vector, which contains the indeterminates, corresponding to the error pattern.

2) Suppose that the decoder finds a solution to one of these systems of equations. Compute
$d_i = P_m + e_{d_i}$, where $e_{d_i}$ is recovered as part of the solution. If none of these systems of
equations has a solution, set $d_i = P_m$.

This algorithm requires the enumeration of all error patterns in $\mathcal{A}_i(n_e)$ and has high
computational complexity (exponential in the number of errors). In Section V-C, a low complexity
polynomial-time algorithm will be proposed under the assumption that the errors only happen
on the primary paths.

B. Condition for error correction 

Theorem 3: Suppose that there are at most n,. errors in the network (both primary path error 

and protection path error are possible). The result of the decoding algorithm is correct at every 

node if and only if the column vectors in A{mi, are linearly independent for all A{mi, 1122) E 

U„„n^6{o,...,ne}A(nc + <, 2n^ - (ric + <)). 

Proof: First we show that under the stated condition, the decoding algorithm works.
Suppose $E_1$ and $E_2$ denote two error value vectors corresponding to error patterns in
$\mathcal{A}(n_c, n_e - n_c)$ and $\mathcal{A}(n_c', n_e - n_c')$ respectively, with $E_1 \neq E_2$. The linear independence
condition in the theorem implies that there do not exist $E_1$ and $E_2$ such that $HE_1 = HE_2$. To see
this, suppose such $E_1$ and $E_2$ exist. Then $HE_{sum} = 0$, where $E_{sum} = E_1 + E_2$ has at most
$n_c + n_c'$ errors on primary paths and $n_p + n_p' = 2n_e - (n_c + n_c')$ errors on protection paths. These
errors correspond to a member (which is a set of column vectors)
$A(n_c + n_c', 2n_e - (n_c + n_c')) \in \mathcal{A}(n_c + n_c', 2n_e - (n_c + n_c'))$. $HE_{sum} = 0$ contradicts the linear
independence of the column vectors in $A(n_c + n_c', 2n_e - (n_c + n_c'))$. Thus, no $E_1$, $E_2$ with
$HE_1 = HE_2$ exist. This means that if a decoder tries to solve every system of linear equations
according to every possible error pattern with $n_e$ errors, it either gets no solution, or gets the
same solution for multiple solvable systems of linear equations. A decoder at $T_i$ is only
interested in error patterns in $\mathcal{A}_i(n_e)$. If in step 1 it finds a solution $E$ for one system of
equations, $e_{d_i}$ in $E$ is the actual error value for $d_i$ and $d_i = P_m + e_{d_i}$; otherwise, no error
happens on $S_i - T_i$.

Conversely, if there exist some $n_c, n_c'$ such that some member in
$\mathcal{A}(n_c + n_c', 2n_e - (n_c + n_c'))$ is linearly dependent, there exist $E_1'$ and $E_2'$ such that
$HE_1' = HE_2'$ and $E_1' \neq E_2'$. This implies that there exists an $i_1$ such that either $e_{d_{i_1}}$ or
$e_{u_{i_1}}$ is different. At node $T_{i_1}$ or $S_{i_1}$, the decoder has no way to distinguish which one is the
actual error value vector and the decoding fails. ■

The above condition is equivalent to the fact that all vector sets
$A(m_1, m_2) \in \bigcup_{m \in \{0, \ldots, 2n_e\}} \mathcal{A}(m, 2n_e - m)$ are linearly independent.
$|A(m, 2n_e - m)| = 2n_e + m$ and its maximum is $4n_e$. Thus,
the length of the vectors should be at least $4n_e$. In fact, $M = 4n_e$ is sufficient under randomly
chosen coefficients. Suppose that the coefficients are randomly and uniformly chosen
from $GF(q)$. For a fixed $m$, the probability that $A(m, 2n_e - m) = A_c(m) \cup A_p(2n_e - m)$
is linearly independent is $p_1(m) = \prod_{i=1}^{2m} \left(1 - \frac{1}{q^{2n_e - m + i}}\right)$. Considering all members in
$\mathcal{A}(m, 2n_e - m)$ and all values of $m$, by the union bound, the probability of successful decoding is
at least $1 - \sum_{m=0}^{2n_e} (1 - p_1(m)) \binom{n}{m} \binom{M}{2n_e - m}$, which approaches 1 when $q$ is large.
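
The dependence of this bound on the field size can be explored numerically. The sketch below assumes $M = 4n_e$, the per-pattern probability $p_1(m) = \prod_{i=1}^{2m} (1 - q^{-(2n_e - m + i)})$, and $\binom{n}{m}\binom{M}{2n_e - m}$ members for each $m$; the function names are hypothetical.

```python
from math import comb


def p_independent(m: int, n_e: int, q: int) -> float:
    """Assumed probability that one pattern with m primary-path errors and
    2*n_e - m protection-path errors is linearly independent (M = 4*n_e)."""
    prob = 1.0
    for i in range(1, 2 * m + 1):
        prob *= 1.0 - q ** -(2 * n_e - m + i)
    return prob


def success_lower_bound(n: int, n_e: int, q: int) -> float:
    """Union-bound lower bound on the probability of successful decoding."""
    M = 4 * n_e
    failure = sum(
        (1.0 - p_independent(m, n_e, q)) * comb(n, m) * comb(M, 2 * n_e - m)
        for m in range(2 * n_e + 1)
    )
    return 1.0 - failure


if __name__ == "__main__":
    for q in (2**4, 2**8, 2**16):
        print(q, success_lower_bound(n=10, n_e=1, q=q))
```

For small $q$ the union bound can be loose or even negative; it becomes informative only once $q$ is large relative to the number of patterns.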

C. Reed-Solomon like efficient decoding for primary path error only case 

If the errors only happen on primary paths, the condition in Theorem 3 becomes that each member
of $\mathcal{A}(2n_e, 0)$ is linearly independent. We can choose $H$ as a Vandermonde-type matrix whose
entry $H_{ij}$ is a power of $\alpha$, where $\alpha$ is a primitive element of $GF(q)$, with $q > 2n$. This is
a parity check matrix of a $(2n, 2n - M)$ Reed-Solomon code. Denote it by $H_{RS}$. Any $M$ ($M = 4n_e$)
columns of $H_{RS}$ are linearly independent and therefore satisfy the condition in Theorem 3. Thus,
(5) becomes $H_{RS}[e_{d_1}, e_{u_1}, \ldots, e_{d_n}, e_{u_n}]^T = P_{syn}$,
in which $H_{RS}$ and $P_{syn}$ are known by every node. The decoding problem is then to find an error
pattern with at most $n_e$ errors and the corresponding error value vector. Note that in fact there
are $2n_e$ error values to be decided. This problem can be viewed as an RS hard-decision decoding
problem in which the number of errors is bounded by $2n_e$. $P_{syn}$ can be viewed as the syndrome of
a received message. We can apply the Berlekamp-Massey algorithm (BMA) for decoding. It is an
efficient polynomial-time algorithm, whereas the algorithm proposed in Section V-A has exponential
complexity. Further details about RS codes and BMA can be found in [19].
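
The column-independence property of such a matrix can be checked exhaustively on a toy instance. The sketch below is illustrative only: it works over the prime field $GF(7)$ with $\alpha = 3$, and uses the exponent convention $H_{ij} = \alpha^{i(j-1)}$ (1-indexed), all of which are assumptions made here and not necessarily the choices of this paper. It takes $n = 3$ (so $2n = 6 < q$) and $n_e = 1$ (so $M = 4$).

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field GF(p), by Gaussian elimination."""
    mat = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(mat[0])):
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col]), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        inv = pow(mat[rank][col], -1, p)  # modular inverse (Python 3.8+)
        mat[rank] = [(x * inv) % p for x in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][col]:
                factor = mat[r][col]
                mat[r] = [(a - factor * b) % p for a, b in zip(mat[r], mat[rank])]
        rank += 1
        if rank == len(mat):
            break
    return rank


def rs_parity_check(M, n, alpha, p):
    """M x 2n matrix with entry alpha^((i+1)*j) for 0-indexed row i, column j."""
    return [[pow(alpha, (i + 1) * j, p) for j in range(2 * n)] for i in range(M)]


def every_M_columns_independent(H, p):
    """True if every choice of M = len(H) columns of H has full rank over GF(p)."""
    M = len(H)
    for cols in combinations(range(len(H[0])), M):
        sub = [[H[i][j] for j in cols] for i in range(M)]
        if rank_mod_p(sub, p) < M:
            return False
    return True


if __name__ == "__main__":
    H = rs_parity_check(M=4, n=3, alpha=3, p=7)
    print(every_M_columns_independent(H, 7))
```

If $\alpha$ is not primitive (for example $\alpha = 6$ in $GF(7)$, which has order 2), columns repeat and the check fails, which illustrates why a primitive element and $q > 2n$ are required.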

VI. Recovery from a combination of errors and failures 

We now consider a combination of errors and failures on primary and protection paths. Recall
that when a primary path or a protection path is in failure, all the nodes are assumed to be
aware of the location of the failure. Assume that there are a total of $n_f$ failures in the network,
such that $n_{f_c}$ failures are on primary paths and $n_{f_p} = n_f - n_{f_c}$ failures are on protection paths. If
a protection path has a failure it is basically useless and we remove the equation corresponding
to it in error model (5). Thus, we shall mainly work with primary path failures, and error model
(5) will have $M' = M - n_{f_p}$ equations. In our error model, when a primary path failure happens,
$\hat{d}_i = 0$ ($\hat{u}_i = 0$ respectively). We can treat a primary path failure as a primary path error with error
value $e_{d_i} = d_i$ ($e_{u_i} = u_i$ respectively). In the failure-only case considered in [7], $n_{f_c}$ protection
paths are needed for recovery from $n_{f_c}$ primary path failures. However, the coefficients there are
chosen such that $\alpha_i^{(k)} = \beta_i^{(k)}, \forall i, k$, which violates the condition for error correction discussed
before. Thus, we need more paths when faced with a combination of errors and failures.

The decoding algorithm and condition in this case are very similar to the multiple error case. An
important difference is that the decoder knows the location of the $n_f$ failures. To handle the case of
failures, we need to modify some definitions in Section V.

Definition 3: A subset of columns of $H$ denoted by $F(n_{f_c})$ is said to be a failure pattern
with $n_{f_c}$ failures on primary paths $\{f_1, \ldots, f_{n_{f_c}}\} \subseteq \{1, \ldots, n\}$ if it has the following form:
$F(n_{f_c}) = \{v_{2f_1 - 1}, v_{2f_1}, \ldots, v_{2f_{n_{f_c}} - 1}, v_{2f_{n_{f_c}}}\}$, $f_i \in \{1, \ldots, n\}$.

Definition 4: An error/failure pattern with $m_1$ primary path errors, $m_2$ protection path errors
and failure pattern $F(n_{f_c})$ is defined as $A^F(m_1, m_2, F(n_{f_c})) = A(m_1, m_2)_{\backslash F(n_{f_c})} \cup F(n_{f_c})$,
where $A(m_1, m_2)_{\backslash F(n_{f_c})} \in \mathcal{A}(m_1, m_2)$ and is such that $A(m_1, m_2)_{\backslash F(n_{f_c})} \cap F(n_{f_c}) = \emptyset$, i.e.,
$A(m_1, m_2)_{\backslash F(n_{f_c})}$ is an $(m_1, m_2)$ type error, of which the primary path errors do not happen on
failed paths in $F(n_{f_c})$.

We let $\mathcal{A}^F(m_1, m_2, F(n_{f_c}))$ denote the family of error/failure patterns with $m_1$ primary path
errors, $m_2$ protection path errors ($(m_1, m_2)$ type errors) and a fixed failure pattern $F(n_{f_c})$.

Definition 5: Define a subset of $\mathcal{A}^F(m_1, m_2, F(n_{f_c}))$, denoted as $\mathcal{A}^F(m_1, m_2, F(n_{f_c}))_i$, to be
the family of error/failure patterns such that each pattern includes an error or failure on $S_i - T_i$,
i.e., $A^F(m_1, m_2, F(n_{f_c})) \in \mathcal{A}^F(m_1, m_2, F(n_{f_c}))_i$ if and only if $\{v_{2i-1}, v_{2i}\} \subseteq A^F(m_1, m_2, F(n_{f_c}))$.

An error/failure value vector $E$ corresponds to an error/failure pattern $A^F(m_1, m_2, F(n_{f_c}))$ if
the entries corresponding to $A(n, M) \setminus A^F(m_1, m_2, F(n_{f_c}))$ are zero, while the other entries may
be non-zero.

A. Decoding algorithm at node $T_i$ for combined failures and errors ($S_i$ operates similarly)

1) Note that $T_i$ knows the failure pattern for all primary paths $F(n_{f_c})$. It tries to solve equations
of the form (5) according to each error/failure pattern in $\mathcal{A}^F(n_c, n_e - n_c, F(n_{f_c}))_i$, over all
admissible $n_c \le n_e$. The indeterminates are given by the error value vector corresponding to the
error pattern.

2) Suppose that the decoder finds a solution to one of these systems of equations. Compute
$d_i = P_m + e_{d_i}$. If none of these systems of equations has a solution, set $d_i = P_m$.

B. Condition for errors/failures correction 

Theorem 4: Suppose there are at most $n_e$ errors and $n_{f_c}$ primary path failures in the network,
where both primary path errors and protection path errors are possible. The decoding algorithm works
at every node if and only if the column vectors in $A(m_1, m_2)$ are linearly independent for all
$A(m_1, m_2) \in \bigcup_{m \in \{0, \ldots, 2n_e\}} \mathcal{A}(n_{f_c} + m, 2n_e - m)$.

Proof: The condition implies that for all $n_c, n_c' \in \{0, \ldots, n_e\}$ and all possible failure patterns
$F(n_{f_c})$, each member in $\mathcal{A}^F(n_c + n_c', 2n_e - (n_c + n_c'), F(n_{f_c}))$ contains linearly independent
vectors. The rest of the proof is similar to that of Theorem 3 and is omitted. ■

The maximum number of vectors contained in each such error pattern is $4n_e + 2n_{f_c}$. Thus, we
need at least $M' = 4n_e + 2n_{f_c}$ equations in (5), which implies in turn that $M = 4n_e + 2n_{f_c} + n_{f_p}$.
Since we do not know $n_{f_c}, n_{f_p}$ a priori, we need at least $M = 4n_e + 2n_f$, since in the worst case
all failures could happen on the primary paths. On the other hand, $M = 4n_e + 2n_f$ is sufficient
under random choice of coefficients from a large enough field.
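
The worst-case argument above can be checked by maximizing over all splits of the $n_f$ failures between primary and protection paths. This is a small sketch; the function names are hypothetical.

```python
def min_equations(n_e: int, n_fc: int) -> int:
    """Equations M' needed with n_e errors and n_fc primary-path failures."""
    return 4 * n_e + 2 * n_fc


def protection_paths_needed(n_e: int, n_f: int) -> int:
    """Worst case of M = M' + n_fp over all splits n_f = n_fc + n_fp.

    Each failed protection path removes one equation, so it must be
    added back on top of the M' equations that are still required.
    """
    return max(
        min_equations(n_e, n_fc) + (n_f - n_fc) for n_fc in range(n_f + 1)
    )


if __name__ == "__main__":
    for n_e, n_f in [(1, 0), (1, 1), (2, 3)]:
        print(n_e, n_f, protection_paths_needed(n_e, n_f))
```

The maximum is attained at $n_{f_c} = n_f$, i.e. when every failure hits a primary path, which gives $4n_e + 2n_f$.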

If we restrict the errors/failures to be only on the primary paths, then the condition becomes that
each member of $\mathcal{A}(2n_e + n_f, 0)$ is linearly independent, and we can choose $H$ to be the parity-check
matrix of a $(2n, 2n - 4n_e - 2n_f)$ RS code. In the error/failure value vector $E$, the locations of
the failures are known. The decoding problem can be viewed as an RS hard-decision decoding
problem in which the number of error values is bounded by $2n_e$ and the number of failure values
is bounded by $2n_f$. It can be solved by a modified BMA [19] that works for errors and erasures.






VII. Simulation results and comparisons 

In this section, we use simulations to show how our network coding-based protection scheme can
save network resources. Under our adversary error model, when the adversary
controls a single link, one simple protection scheme is to provision three edge-disjoint paths
for each primary connection, analogous to a (3,1) repetition code. This is referred to as a 2+1
scheme, meaning that two additional paths are used to protect one connection. We call our
proposed scheme 4+n, i.e., four additional paths are used to protect n connections. It is expected
that when n becomes large, 4+n will use fewer resources than 2+1. We provisioned primary and
protection paths for both cases and compared their costs. Our protection scheme can be used in
different networks, including an optical network deployed over a large area, or any overlay network
regardless of the underlying supporting network and the scale of the network.

In the simulation, we use two networks: 1) the Labnet03 network for North America [20], [21]
(Fig. 4), and 2) the COST239 network for Europe [20], [22] (Fig. 5). Our integer linear programming
(ILP) for the proposed 4+n scheme is formulated as follows. The network topology is modelled
as an undirected graph $G = (V, E)$. Considering that usually there are multiple optical fibers
between two cities, we inflate the graph $G$ such that each edge is copied several times (four
times in our simulations), i.e., there are four parallel edges between the nodes. An edge $(i,j)$
in $G$ is replaced by edges $(i,j)_1, (i,j)_2, (i,j)_3, (i,j)_4$ in the inflated graph. The set of unicast
connections to be established is given by $\mathcal{N} = \{(S_1, T_1), \ldots, (S_n, T_n)\}$. In order to model the
protection paths as flows, we add a virtual source $s$ and a virtual sink $t$ to the network and
connect $s$ and $t$ with the end nodes of connections in $\mathcal{N}$. This procedure is illustrated in Fig. 2.
We call this inflated graph $G' = (V', E')$. Every edge $(i,j)_k$ connecting nodes $i$ and $j$ is associated
with a positive number $c_{ij}$, the cost per unit flow of this link, which is proportional to the
distance between the nodes $i$ and $j$. Assume that each link has enough capacity so there is no
capacity constraint. We hope to find the optimal $4 + n$ paths that satisfy appropriate constraints
on the topology¹ in the network and minimize the total cost. One protection path can be viewed
as a unit flow from $s$ to $t$, while one primary path $S_i - T_i$ can be viewed as a unit flow from $S_i$
to $T_i$. Therefore, the problem can be formulated as a minimum cost flow problem under certain
conditions. Each edge $(i,j)_k$ is associated with $4 + n$ binary flow variables $f_{ij,k}^{m}$, $1 \le m \le n + 4$,
which equal 1 if path $m$ passes through edge $(i,j)_k$ and 0 otherwise. The ILP is formulated as
follows.

$$\min \sum_{(i,j)_k \in E'} \sum_{1 \le m \le n+4} c_{ij} f_{ij,k}^{m}$$

The constraints are such that 

1) Flow conservation constraints hold for primary paths and protection paths. 

2) Each protection path should pass through the end nodes of all the connections. 

3) The primary paths are edge-disjoint. 

4) The primary paths and the protection paths are edge-disjoint. 

5) The protection paths are edge-disjoint. 

The minimization is over $f_{ij,k}^{m}$, $(i,j)_k \in E'$, $1 \le m \le 4 + n$, and some auxiliary variables
that are used to mathematically describe the constraints. We assume that when an adversary
attacks an edge in the network she can control all paths going through that link. Thus, we have
edge-disjoint constraints so that she only causes one path to be in error in the network. For a detailed
mathematical description of the constraints, please refer to [8], which gives a similar formulation. We
call this formulation ILP1.
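
The graph inflation and the edge-disjointness constraints can be illustrated without an ILP solver. The sketch below only builds the parallel edge copies, evaluates the cost of a given set of paths, and checks pairwise edge-disjointness; it does not solve the ILP, omits the virtual source and sink, and all names and the toy topology are hypothetical.

```python
from itertools import combinations


def inflate(edges, copies=4):
    """Replace each edge (i, j) of cost c by parallel copies (i, j, k), k = 1..copies.

    `edges` maps an undirected edge (i, j) to its per-unit-flow cost.
    """
    return {(i, j, k): c for (i, j), c in edges.items() for k in range(1, copies + 1)}


def total_cost(paths, inflated):
    """Sum of edge costs over all paths; each path is a list of edge copies."""
    return sum(inflated[e] for path in paths for e in path)


def edge_disjoint(paths):
    """True if no edge copy is used by more than one path."""
    return all(set(a).isdisjoint(b) for a, b in combinations(paths, 2))


if __name__ == "__main__":
    G = {("A", "B"): 3, ("B", "C"): 2}
    G_inflated = inflate(G)
    p1 = [("A", "B", 1), ("B", "C", 1)]
    p2 = [("A", "B", 2), ("B", "C", 2)]
    print(edge_disjoint([p1, p2]), total_cost([p1, p2], G_inflated))
```

Two paths between the same pair of cities become edge-disjoint in the inflated graph simply by using different copies of each edge, which is what allows several paths to share a physical route.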

We also provision the paths for the 2+1 scheme. The provisioning of the paths also minimizes
the total cost, i.e., the objective is to minimize $\sum_{(i,j) \in E'} \left( \sum_{1 \le m \le n} \sum_{1 \le l \le 3} c_{ij} f_{ij}^{ml} \right)$, where $f_{ij}^{ml}$
is the flow variable for the $l$-th path of the $m$-th primary connection. Furthermore, the three paths
for one connection should be edge-disjoint. We call this formulation ILP2.

¹ We only provision one set of protection paths for the connections in $\mathcal{N}$. We could optimally partition $\mathcal{N}$ into several subsets,
each of which is protected by a set of protection paths as in [8]. This would give a better solution but greatly complicates the ILP.
In our simulation, the 4+n scheme shows gains under the simpler formulation. Thus, we simulate under the simpler formulation.

However, in general $G'$ contains a large number of edges, which results in a long computation
time for ILP1. In order to simulate and compare efficiently, instead of solving ILP1 directly,
we present an upper bound on the cost of our proposed 4+n scheme that can be computed much
faster. The connection set $\mathcal{N}$ is chosen as follows. Instead of choosing $n$ connections at random,
we choose $n/2$ connections at random (denoted as the connection set $\mathcal{N}_1$) and duplicate those
connections to obtain $\mathcal{N}$. So there are two independent unicast connections between two cities.
We remove the fifth constraint (edge-disjointness of protection paths) from ILP1 and run the
ILP instead on the original graph $G$ for $\mathcal{N}_1$. We call this ILP3. Then, we modify the
optimal solution of ILP3 to obtain a feasible solution of ILP1 for $n$ connections on $G'$.
This is illustrated in Fig. 3. The cost of this feasible solution is an upper bound on the optimal
cost of ILP1. From simulations with a small number of connections, we observe that the
bound is approximately 10% larger than the actual optimal cost. It turns out that solving ILP2
is fast; therefore we obtain the actual optimal cost for the 2+1 scheme.

In the simulation, we choose $|\mathcal{N}_1|$ from 5 to 9 such that $n$ goes from 10 to 18. The ILPs
are solved by CPLEX. The costs for the 4+n scheme and the 2+1 scheme are averaged over five
realizations of $\mathcal{N}_1$. The average costs and percentage gains for different numbers of connections
are presented in Table III and Table IV. As expected, the gain of our proposed scheme
increases with the number of connections.
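
The percentage gains in Tables III and IV can be recomputed from the average costs. The sketch below assumes that the gain is measured relative to the 2+1 cost, i.e. $(c_{2+1} - c_{4+n}) / c_{2+1}$; this definition is an assumption, chosen because it reproduces the tabulated values. The dictionary and function names are hypothetical.

```python
def percentage_gain(cost_4n: float, cost_21: float) -> float:
    """Relative saving of the 4+n scheme with respect to the 2+1 scheme, in percent."""
    return 100.0 * (cost_21 - cost_4n) / cost_21


# n -> (average cost for 4+n (upper bound), average cost for 2+1)
LABNET03 = {
    10: (1826.0, 1916.4),
    12: (2106.4, 2295.6),
    14: (2339.6, 2598.8),
    16: (2677.6, 3049.2),
    18: (3105.2, 3660.0),
}
COST239 = {
    10: (1226.0, 1245.0),
    12: (1548.0, 1628.4),
    14: (1742.4, 1854.0),
    16: (1810.8, 1958.4),
    18: (1883.2, 2114.4),
}

if __name__ == "__main__":
    for name, table in (("Labnet03", LABNET03), ("COST239", COST239)):
        for n, (c4n, c21) in table.items():
            print(name, n, round(percentage_gain(c4n, c21), 2))
```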

Intuitively, our proposed scheme will have more gain when the connections are over long
distances, e.g., connections between the east coast and the west coast of the US. Roughly
speaking, the number of paths crossing the long distance (inducing high cost) is $4 + n$ for
our scheme, while it is $3n$ for the 2+1 scheme. We also ran some simulations on the Labnet03
network to verify this by choosing the connections to cross the American continent. For a setting
with ten connections, we observed a 36.7% gain. When $n = 6$ and $n = 7$, we observed up to
15.5% and 17.8% gains respectively. We conclude that our 4+n scheme is particularly efficient
in allocating network resources when the primary paths are over long distances or have high
cost.
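
The rough path-count comparison can be made explicit. The sketch below counts paths only; it ignores path lengths and the requirement that each protection path pass through the end nodes of all connections, so it is an intuition aid and not a cost model. The function names are hypothetical.

```python
def paths_2plus1(n: int) -> int:
    """Three edge-disjoint paths per connection."""
    return 3 * n


def paths_4plusn(n: int) -> int:
    """One primary path per connection plus four shared protection paths."""
    return 4 + n


def path_saving(n: int) -> int:
    """Number of long-distance paths saved by 4+n relative to 2+1."""
    return paths_2plus1(n) - paths_4plusn(n)


if __name__ == "__main__":
    for n in (2, 6, 10, 18):
        print(n, paths_2plus1(n), paths_4plusn(n), path_saving(n))
```

The two counts coincide at $n = 2$, and the saving $2n - 4$ grows linearly with the number of connections.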

VIII. Conclusions and Future Work

In this paper we considered network coding-based protection strategies against adversarial
errors for multiple unicast connections that are protected by shared protection paths. Each unicast
connection is established over a primary path and the protection paths pass through the end nodes
of all connections. We demonstrated suitable encoding coefficient assignments and decoding
algorithms that work in the presence of errors and failures. We showed that when the adversary
introduces $n_e$ errors, which may be on primary paths or protection paths, $4n_e$ protection paths
are sufficient for data recovery at all the end nodes. More generally, when there are $n_e$ errors
and $n_f$ failures on primary or protection paths, $4n_e + 2n_f$ protection paths are sufficient for
correct decoding at all the end nodes. Simulations show that our proposed scheme saves network
resources compared to the 2+1 protection scheme, especially when the number of primary paths is
large or the costs for establishing primary paths are high, e.g., long distance primary connections.

Future work includes investigating more general topologies for network coding-based protection.
The 2+1 scheme can be viewed as one where there is usually no sharing of protection
resources between different primary connections, whereas the 4+n scheme enforces full sharing
of the protection resources. Schemes that exhibit a tradeoff between these two are worth
investigating. It is also interesting to further examine the resource savings when we partition
the primary paths into subsets and provision protection resources for each subset separately.
Furthermore, in this paper we considered an adversarial error model. When errors are random,
we could use classical error control codes to provide protection. But it is interesting to consider
schemes that combine channel coding across time and the coding across the protection paths in
a better manner. A reviewer has pointed out that rank metric codes [15] might also be useful
for this problem.

References 

[1] D. Zhou and S. Subramaniam, "Survivability in optical networks," IEEE Network, vol. 14, pp. 16-23, Nov./Dec. 2000.
[2] R. Ahlswede, N. Cai, S.-Y. Li, and R. W. Yeung, "Network Information Flow," IEEE Trans. on Info. Th., vol. 46, no. 4, pp. 1204-1216, 2000.
[3] A. Kamal, "1+n protection in optical mesh networks using network coding on p-cycles," in IEEE Globecom, 2006.
[4] A. Kamal, "1+n protection against multiple faults in mesh networks," in IEEE Intl. Conf. on Commu. (ICC), 2007.
[5] A. E. Kamal, "1+n network protection for mesh networks: Network coding-based protection using p-cycles," IEEE/ACM Transactions on Networking, vol. 18, no. 1, pp. 67-80, Feb. 2010.
[6] D. Stamatelakis and W. D. Grover, "IP layer restoration and network planning based on virtual protection cycles," IEEE Journal on Selected Areas in Communications, vol. 18, no. 10, pp. 1938-1949, 2000.
[7] A. E. Kamal and A. Ramamoorthy, "Overlay protection against link failures using network coding," in 42nd Conf. on Info. Sci. and Sys. (CISS), 2008.
[8] A. E. Kamal, A. Ramamoorthy, L. Long, and S. Li, "Overlay Protection Against Link Failures Using Network Coding," submitted to IEEE/ACM Trans. on Networking, 2009.
[9] R. Koetter and M. Medard, "An Algebraic Approach to Network Coding," IEEE/ACM Transactions on Networking, vol. 11, no. 5, pp. 782-795, 2003.
[10] R. W. Yeung and N. Cai, "Network error correction, Part I: Basic concepts and upper bounds," Comm. in Info. and Sys., pp. 19-36, 2006.
[11] N. Cai and R. W. Yeung, "Network error correction, Part II: Lower bounds," Comm. in Info. and Sys., pp. 37-54, 2006.
[12] Z. Zhang, "Linear network error correction codes in packet networks," IEEE Trans. on Info. Th., vol. 54, no. 1, pp. 209-218, Jan. 2008.
[13] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, M. Medard, and M. Effros, "Resilient network coding in the presence of Byzantine adversaries," in IEEE INFOCOM, 2007, pp. 616-624.
[14] S. Yang, R. W. Yeung, and C. K. Ngai, "Refined coding bounds and code constructions for coherent network error correction," preprint.
[15] D. Silva, F. Kschischang, and R. Koetter, "A rank-metric approach to error control in random network coding," IEEE Trans. on Info. Th., vol. 54, no. 9, pp. 3951-3967, Sept. 2008.
[16] Z. Li and B. Li, "Network coding: the case of multiple unicast sessions," in Proc. of the 42nd Allerton Annual Conference on Communication, Control, and Computing, 2004.
[17] N. J. A. Harvey, D. R. Karger, and K. Murota, "Deterministic network coding by matrix completion," in SODA '05: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2005, pp. 489-498.
[18] L. Lovasz, "On determinants, matchings and random algorithms," in Fund. Comput. Theory 79, Berlin, 1979.
[19] S. Lin and D. J. Costello, Error control coding: fundamentals and applications. Prentice Hall, 2004.
[20] M. Menth and R. Martin, "Network resilience through multi-topology routing," in Proc. of 5th Intl. Workshop on Design of Reliable Communication Networks, 2005, pp. 271-277.
[21] U. Walter, "Autonomous optimization of Next Generation Networks," in 2nd International Workshop on Self-Organizing Systems, Sep. 2007.
[22] A. Kodian and W. Grover, "Failure-independent path-protecting p-cycles: efficient and simple fully preconnected optical-path protection," Journal of Lightwave Technology, vol. 23, no. 10, pp. 3241-3259, 2005.




Fig. 1. Three primary paths $S_i - T_i$, $i = 1, \ldots, 3$, being protected by a single protection path $P^{(k)}$. The clockwise direction
is $S^{(k)}$ and the counter-clockwise direction is $T^{(k)}$. $\sigma(S_2) = T_3$, $\tau^{-1}(T_3) = T_2$. The encoded data units on $S^{(k)}$ are labeled
inside the protection path and the encoded data units on $T^{(k)}$ are labeled outside the protection path. At $T_3$, the data unit
$P^{(k)} = \alpha_1 d_1 + \beta_1 \hat{u}_1 + \alpha_2 d_2 + \beta_2 \hat{u}_2 + \alpha_1 \hat{d}_1 + \beta_1 u_1 + \alpha_3 \hat{d}_3 + \beta_3 u_3 + \alpha_2 \hat{d}_2 + \beta_2 u_2$; if there is no error, $P^{(k)} = \alpha_3 d_3 + \beta_3 u_3$.



TABLE I

Frequently used notations in this paper.

| Notation | Meaning |
|---|---|
| $n$ | The number of primary connections |
| $M$ | The number of protection paths |
| $S_i$, $T_i$ | The end nodes of the $i$-th primary connection |
| $d_i$, $u_i$ | The data unit sent by $S_i$, $T_i$ respectively |
| $\hat{d}_i$, $\hat{u}_i$ | The data unit received by $T_i$, $S_i$ respectively |
| $\alpha_i^{(k)}$, $\beta_i^{(k)}$ | The encoding coefficients for the $i$-th primary connection on the $k$-th protection path |
| $n_e$, $n_f$ | The number of errors and failures in the network |
| $n_c$, $n_p$ | The number of errors on the primary paths and the protection paths respectively |
| $e_{d_i}$, $e_{u_i}$ | The error values of $d_i$, $u_i$ respectively |

TABLE II

Data obtained by $T_i$ under the simple coefficient assignment.

| | No error | Error on $S_i - T_i$ | Error on $S_x - T_x$, $i \neq x$ |
|---|---|---|---|
| $P_m$ | $d_i$ | $d_i + e_{d_i}$ | $d_i$ |
| $P_{syn}^{(1)}$ | | | |
| $P_{syn}^{(2)}$ | | | |
| $P_{syn}^{(3)}$ | | | |
| $P_{syn}^{(4)}$ | | | |

Fig. 3. A feasible solution of ILP1 is obtained from the optimal solution of ILP3. Here, $\mathcal{N}_1 = \{(S_1, T_1), (S_2, T_2)\}$ and
$\mathcal{N} = \{(S_1, T_1), (S_2, T_2), (S_3, T_3), (S_4, T_4)\}$, where $S_1 = S_3$, $T_1 = T_3$, $S_2 = S_4$, $T_2 = T_4$. Suppose the left graph is the
optimal solution obtained from ILP3 on $G$ for $\mathcal{N}_1$. The bold edges indicate that four protection paths pass through those edges.
The right graph is a feasible solution of ILP1 on $G'$. The protection paths are split into four copies of edges so that the fifth
constraint (edge-disjointness of protection paths) holds. The paths $S_1 - T_1$, $S_2 - T_2$ are copied to establish $S_3 - T_3$, $S_4 - T_4$.
It remains feasible because in $G'$ there are four such paths for each connection and now we only occupy two of them.

(b) Link costs of Labnet03 network.



Fig. 4. Labnet03 network with 20 nodes and 53 edges in North America. 

Fig. 5. COST239 network with 11 nodes and 26 edges in Europe.



TABLE III

Comparison of the average costs for Labnet03 network

| $n$ | Average cost for 4+n (upper bound) | Average cost for 2+1 | Percentage gain |
|---|---|---|---|
| 10 | 1826 | 1916.4 | 4.72% |
| 12 | 2106.4 | 2295.6 | 8.24% |
| 14 | 2339.6 | 2598.8 | 9.97% |
| 16 | 2677.6 | 3049.2 | 12.19% |
| 18 | 3105.2 | 3660 | 15.16% |



TABLE IV

Comparison of the average costs for COST239 network

| $n$ | Average cost for 4+n (upper bound) | Average cost for 2+1 | Percentage gain |
|---|---|---|---|
| 10 | 1226 | 1245 | 1.53% |
| 12 | 1548 | 1628.4 | 4.94% |
| 14 | 1742.4 | 1854 | 6.02% |
| 16 | 1810.8 | 1958.4 | 7.54% |
| 18 | 1883.2 | 2114.4 | 10.93% |